Neural Network Methods for Boundary Value 
Problems Denned in Arbitrarily Shaped Domains 

bo 

Q\ \ I. E. Lagaris* 

School of Nuclear Engineering, Purdue University 
West Lafayette, INDIANA 47907 - USA 

Q ; A. Likas 

^ ■ Department of Computer Science, University of Ioannina 

45110 Ioannina - GREECE 

D. G. Papageorgiou 
Network Operations Center, University of Ioannina 
O '. 45110 Ioannina - GREECE 



> 

■ Abstract 

o ■ 

■ Partial differential equations (PDEs) with Dirichlet boundary condi- 
s ! ' tions defined on boundaries with simple geomerty have been succesfuly 

treated using sigmoidal multilayer perceptrons in previous works |], ||. 
This article deals with the case of complex boundary geometry, where the 
boundary is determined by a number of points that belong to it and are 
closely located, so as to offer a reasonable representation. Two networks 
are employed: a multilayer perceptron and a radial basis function net- 
work. The later is used to account for the satisfaction of the boundary 
• conditions. The method has been succesfuly tested on two-dimensional 

?H ' and three-dimensional PDEs and has yielded accurate solutions. 
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1 Introduction 

Neural Networks have been employed before to solve boundary and initial value 

problems as well as eigenvalue problems [|[. The cases treated in the above 

mentioned articles were for simple hnite or extended to infinity orthogonal box 

boundaries. However when one deals with realistic problems, as for instance 

in modelling the human head-neck system || or the flow and mass transfer 

in chemical vapor deposition reactors Q, the boundary is highly irregular and 

cannot be described in terms of simple geometrical shapes, that in turn would 

have allowed for a simple modelling scheme. 

'Permanent address: Dept. of Computer Science, University of Ioannina, 45110 Ioannina 
- GREECE 
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In this article we propose a method capable of dealing with such kind of 
arbitrarily shaped boundaries. As before Q our approach is based on the 
use of feedforward artificial neural networks (ANNs) whose approximation ca- 
pabilities have been widely aknowledged |7|, |8|]. More specifically, the proposed 
approach is based on the synergy of two feedforward ANNs of different types: a 
multilayer perceptron (MLP) as the basic approximation element and a radial 
basis function (RBF) network for satisfying the BCs, at the selected boundary 
points. In addition, our approach relies on the availability of efficient software 
for multidimensional minimization || that is used for adjusting the parameters 
of the networks. 

A solution to differential equation problems based on ANNs exhibits several 
desirable features: 

• Differentiable closed analytic form. 

• Superior interpolation properties. 

• Small number of parameters. 

• Implementable on existing specialized hardware (neuroprocessors). 

• Also efficiently implementable on parallel computers. 

In the next section we describe the proposed method and derive useful for- 
mulas, while in section 3, we discuss implementation procedures and numerical 
techniques. In section 4 we illustrate the method by means of examples and 
we compare our results to analytically known ones. Finally section 5 contains 
conclusions and directions for future research. 

2 Description of the method 

We will examine PDEs of the form 

L* = f (1) 

where L is a differential operator and ^ = ^(x) (i£Dc R^), with Dirichlet 
B.C.s, i.e. with ^ being specified on the boundary B = dD. The boundary can 
be any arbitrarily complex geometrical shape. We consider that the boundary is 
defined as a set of points that are chosen so as to represent its shape reasonably 
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accurate. Suppose that M points R\, R2, ■ ■ ■ , Rm £ B are chosen to represent 
the boundary and hence the boundary conditions are given by: 

*(Ri) = bi, i = l,2,...,M (2) 

To obtain a solution to the above differential equation, the collocation 
method || is adopted which assumes the discretization of the domain D into a 
set of points D (these points are denoted by n, i = 1, . . . , K). The problem is 
then transformed into the following system of equations: 

L*(n) = /(ri),Vr< G D, and V(Ri) = h, \/R t G B (3) 

Let ^ m{x,p) denote a trial solution to the above problem where p stands 
for a set of model parameters to be adjusted. In this way, the problem is 
transformed into the following constrained minimization problem: 

K 

min p E(p) = J2( L ^M(ri,p) - /(r^)) 2 (4) 

subject to the constraints imposed by the B.Cs 

y M (Ri,p) = bi, i = l,2,...,M (5) 

The above constrained optimization problem may be tackled in a number of 
ways. 

1. Either devise a model ^m(^,p), such that the constraints are exactly 
satisfied by construction and hence use unconstrained optimization tech- 
niques, 

2. Or, use a suitable constrained optimization method for non-linear con- 
straints. For instance: Lagrange Multipliers, Active Set methods or a 
Penalty Function approach. 

A model suitable for the first approach is a synergy of two feedforward neural 
networks of different type, and it can be written as: 

M 

* M (x,p) = N(x,p) + J2qie- Xlx - Rl12 (6) 
1=1 

where N(x,p) is a multilayer perceptron (MLP) with the weights and biases 
collectively denoted by the vector p. The sum in the above equation is an RBF 
network with M hidden units that all share a common exponential factor A. 
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For a given set of MLP parameters p, the coefficients qi are uniquely deter- 
mined by requiring that the boundary conditions are satisfied, ie: 

M 

b l -N{R uP )=Y j qie- X ^- Rl? i = l,2,...,M (7) 
1=1 

Namely one has to solve a linear system, Aq = c, where A^j = e -H R i- R j\ anc [ 
Ci = bi- N(Ri,p) where i,j = 1, . . . ,M. 

We consider now a penalty function method to solve the constrained opti- 
mization problem. The model in this case is simply ^m(x,p) = N(x,p). The 
error function to be minimized is now given by: 

K M 
E(p,T 1 )=Y / (LN(r l ,p)-f(r l )) 2 + V Y / (*M(R i )-b t ) 2 (8) 
i=i i=i 

where the penalty factor 77, takes on higher and higher positive values depending 
on how accurately the BCs are to be satisfied. 

The MLP-RBF synergy satisfies exactly the BCs but it is slow. At every 
evaluation of the model one needs to solve a linear system which may be quite 
large, depending on the problem. Also since many efficient optimization meth- 
ods need the gradient of the error function, one has to solve for each gradient 
component an aditional linear system of the same order. This makes the process 
computationally intensive. On the other hand, the penalty method is very effi- 
cient, however satisfies the BCs approximately only. In practice a combination 
of these two methods may be used profitably in the following manner. 

• Use the penalty method to obtain a reasonable model that satisfies to 
some extend the BCs. 

• Improve the model, using for a few iterations the synergy method, that 
will in addition satisfy the BCs exactly. 

We used the above combination in all of our experiments and our results are 
quite encouraging. 

3 Implementation and Numerical techniques 

The MLPs we have considered contain one hidden layer with sigmoidal hidden 
units and a linear output that is computed as: 

H n 

N(x,p) = v i°(Yl w ij x j + u i) ( 9 ) 
i=l j=\ 
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where n is the number of input units, H is the number of the hidden units and 
a(z) = [l+e-T 1 - 

In order to minimize the error E(p), optimization techniques are employed 
that require the computation of the derivatives ^ and, consequently, the 
derivatives which are listed below: 

op 

dp dp ^ dp 

Since qi = Ei=i,M A Ti~{h ~ N (RiiP)) we g et: 

dqi = j^^ dNjR^p) 
dp dp 

i.e. one has to solve as many M x M linear systems as the number of the 
parameters p. Derivatives of the MLP with respect to either the parameters p 
or the input variables can be easily derived and are given in 

In order to apply the proposed method, first the value of A must be specified 
that defines the linear system (matrix A). In our experiments the linear system 
was solved using standard Choleski decomposition for the matrix A. We did 
not use special methods for sparse linear systems nor any parallel programming 
techniques. 

For large values of A the Gaussian terms in the RBF are all highly localized 
so that affect the model only in the neighborhood of the boundary points. In 
other words the RBF contributes a "correction" to account for the BCs. For 
small values of A, the matrix looses rank and becomes singular. So A must 
be selected with caution. A good choice is found to be: A « Jj, where a 
is the minimum distance between any two points on the boundary, ie: a = 
miriij[\Ri — Rj\], where i,j = 1, .2, • • • , M. Note that different A's may also be 
used instead of a common one in equation (6). In that case the corresponding 
Oj would be the distance of the closest boundary neighbouri to point Rj, i.e. 
dj = mirii[\Ri — Rj\], where % = 1, .2, • • • , M. However a common A leads to a 
symmetric matrix A that in turn renders the linear system easier to solve. 

Training of the MLP network so as to minimize the error of eq. (4) can 
be accomblished using any minimization procedure such as gradient descent 
(backpropagation or any of its variants), conjugate gradient, Newton methods 
etc. Many effective minimization techniques are provided by the Merlin/MCL 
multidimensional optimization system ||, |6| which has been employed in our 
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experiments. It has been earlier demonstrated |T^, [TlJ], that an improvement in 
the generalization of the neural model is achieved, if the sigmoidal parameters 
are kept inside a limited range such that the exponentials do not loose precision. 
Hence box-constrained optimization techniques should be used to guarantee 
the above requirement. From the variety of the minimization methods offered 
by the Merlin optimization environment, the (quadratically convergent) BFGS 



method [12] seemed to have the best performance. 

When solving problems requiring several hundreds of boundary points (and 
thousands of domain points) the method may become relatively slow. There 
are several techniques that may be applied in order to accelerate the process. 
The linear systems are sparse and hence one can employ iterative sparse solvers 
instead of the Choleski factorization method that we used here. When com- 
puting the gradient of the error function, one has to solve many linear systems 
with identical left hand sides and hence one may use special methods that 



currently are under investigation and development [ 13 [ . Parallel programming 
techniques for machines with many cpus are also applicable. The most efficient 
implementation however would be one that will utilize specialized hardware 
(neuroprocessors) . 

We describe now the strategy followed in detail. 

1. At first we use the efficient penalty function approach (with r] = 100 in 
all tests) to obtain an MLP network that approximates the solution both 
inside the domain and on the boundary. 

2. Then we switch to the MLP-RBF method with initial parameter values 
for the MLP network those obtained from the penalty method. Therefore 
the MLP-RBF method starts from a low error value and requires only a 
few minimization steps in order to reach a solution of even lower error 
value which in addition satisfies the BCs exactly. 

4 Examples 

4.1 Two Dimensional Problems 
Problem 1: Consider the problem: 

V 2 *(x,y) = e- x (x-2 + y 3 + 6y), x,ye[0,l] (11) 
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Figure 1: Exact solution of Problem 1 

with boundary conditions: 

(0, y) = y\ *(l,y) = *(x, 0) = xe'*, *(x, 1) = e^(l + x) 

e 

The analytic solution is: ^ (a;, y) = e~ x (x+y i ). This example has also been 
treated in Here the problem is treated by picking points on the boundary 
as if it were any arbitrary shape. More specifically, we take the following points 
(x, y) on the boundary, where m x and m y denote the number of points which 
divide the interval [0, 1] on the x-axis and y-axis respectively and 5 X = l/(m x — 
1), 5 y = l/(m y - 1): 

((i - 1)4,0), i = l,...,m x -l 

((i - 1)S X , 1), i = 1, . . . ,m x - 1 

(0, (i - l)5 y ), i = l,...,m y -l 

(l,(i-l)8 y ), i = l,...,m y -l 

After this selection, a test is made to remove duplicates, which in this case are 
the points at the corners of the rectangle boundary. In our experiments we have 
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Figure 2: Accuracy of obtained solution to problem 1 



considered m x = m y = m = 10 and, therefore, the total number of points taken 
on the boundary is M = 4(m — 1) = 36. For the points inside the definition 
domain we pick points on a rectangular grid by subdividing the [0, 1] interval 
in 10 equal subintervals that correspond to 9 points in each direction. Thus a 
total of K = 81 points are selected. The analytic solution is presented in Fig. 
1, while the accuracy \^m(x, y) — ^ a (x, y) | of the obtained solution is presented 
in Fig. 2. In all two-dimensional examples we used an MLP with 20 hidden 
units. 

Problem 2: 

The same problem is solved with the boundary being the first quarter of 
the unit circle. The solution domain is defined as the rectangle [0, 1] x [0, tv/2] 
on the polar coordinates (r, </>). To obtain the boundary points (x,y) we first 
defined the boundary points (r, <f) in the polar coordinates (according to the 
procedure of the previous problem) and then we computed the (x, y) values: 
x = rcoscj), y = rsincp. We have used M = 37 boundary points and K = 81 
grid points. The exact solution and the accuracy of the obtained solution are 
displayed in Fig. 3 and 4 in the (r, <p) coordinates. 
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Figure 3: Exact solution of problem 2 




Figure 4: Accuracy of obtained solution to problem 2 
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Figure 5: Exact solution of problem 3. 

Problem 3: 

Finally we solved eq. (11) when the boundary is the unit circle. The solution 
domain is defined as the rectangle [0, 1] x [0, 2ir] on the polar coordinates (r, 4>). 
The boundary points (x,y) are defined as x = rcosfii, y = rsinc^i, where 
4>i = 2-iri/m, i = 1, ... ,m — 1. We have used M = 20 boundary points and 
K = 153 grid points. The analytic solution and the accuracy of the obtained 
solution are displayed in Fig. 5 and 6 respectively in the (r, 4>) coordinates. 

4.2 Three Dimensional Problems 
Problem 4: Consider the problem: 

\7 2 ^(x,y,z) = e x y 2 + z 2 siny, x,y,z G [0, 1] (12) 

with analytic solution: ^ a (x,y) = e x y 2 + (z 2 — 2)siny known at the boundary. 

Similarly with the approach described in Problem 1, we define the boundary 
points by dividing the [0, 1] interval along the x-axis, y-axis, z-axis with m x = 
m y = m z = m = 7 points respectively and taking the points (x, y, z): 

({i - 1)5, (i - 1)5,0), i = l,...,m-l 
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Figure 6: Accuracy of obtained solution to problem 3. 




Figure 7: Exact solution of problem 4 for x = 0.5. 
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Figure 8: Accuracy of obtained solution to problem 4 for x = 0.5. 



((i _ 1)5, (i- l)S,l), i = l,... 



m — 1 



((i- 1)<5, 0, (i - 1)5), i = l,... 



m — 1 



((i - 1)5,1, (i- 1)6), i 



= 1, ... ,171 — 1 



(0,(i-l)6,(i-l)6), i = l,..., 



m — 1 



(1, (i — l)<y, (* — l)<y), i = l,...,m-l 



where 6 = l/(m — 1). 

After this specification a test is made to remove duplicates or points that 
were very close to another point, and the final number of boundary points 
was M = 218. For the points inside the definition domain we pick points on 
a rectangular grid subdividing the [0, 1] interval in 10 equal subintervals that 
correspond to 9 points in each direction defining a total of K = 729 points. 
The analytic solution for (x = 0.5) is presented in Fig. 7, while the accuracy 
|^m(0.5, y, z) — * a (0.5, y, z)\ of the obtained solution is presented in Fig. 8. In 
all three-dimensional examples we used an MLP with 40 hidden units. 

Problem 5: We considered the previous problem: 



V 2 ^(x, y, z) = e x y 2 + z 2 siny 



(13) 
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Figure 9: Exact solution of problem 5 for r = 0.75. 

on the domain [0.5, 1] x [0,7r/2] x [0, ir/2] on the spherical coordinates (r,(f>,6). 
Similarly with the approach described in Problem 4, we define the boundary 
points by dividing the intervals r-axis, <^>-axis, 6>-axis with m = 7 points respec- 
tively and taking the points (r, 0, 9): 

((i - l)6 r , (i - 1)^,0), z = l,...,m-l 
{(i - l)S r , (i - 1)5^, 7r/2), i = 1, . . . , m - 1 

((i- 1)^,0,(1-1)^), i = l,...,m-l 
((i- 1)^,^/2,(1-1)^), i = l,...,m-l 
(0.5, (i - 1)^, (i - l)5e), i = 1, • • • , m - 1 

(l,(i- 1)5,(1-1)^), i = l,...,m-l 

where <5 r = 0.5/(m — 1), ^ = 0.5/ (m — 1), (5e = 0.5/(m — 1). From the 
(r, (f>, 6) values we obtained the corresponding (x, y, z) points using the well- 
known transformation. 

After this specification a test is made to remove duplicates or points that 
were very close to another point, and the final number of boundary points 
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Figure 10: Accuracy of obtained solution to problem 5 for r = 0.75. 

was M = 176. For the points inside the definition domain we pick points 
(r,6,(j)) on a rectangular grid subdividing the [0,0.5], [0, tt/2] intervals in 10 
equal subintervals that correspond to 9 points in each direction defining a total 
of K = 729 points (r,4>,9). Then the (x,y,z) points were obtained from the 
(r, 9, 4>) points. The exact solution and accuracy of the obtained solution arc 
displayed in Fig. 9 and 10 in the (4>, 9) coordinates for r = 0.75. 

5 Conclusions 

We have presented a method for solving differential equations with Dirichlet 
BCs, where the boundary can be of arbitrary shape and is discretized to obtain 
a set of boundary points. The method is based on the synergy of MLP and RBF 
artificial neural networks and provides accurate and differentiable solutions (in 
a closed analytic form) that exactly satisfy the BCs at the selected boundary 
points. Moroeover it is possible to implement the method on specialized hard- 
ware (neuroprocessors) to significantly improve the required solution time. The 
proposed method is quite general and can be used for a wide class of PDEs with 
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Dirichlet BCs, regardless of the shape of the boundary. The only requirement 
is that enough boundary points are selected so as to represent the boundary 
shape with sufficient accuracy. 

Future work will focus on the application of the method to real-world 3-D 
problems, containing surfaces of real objects with arbitrary shapes as bound- 
aries. Interesting problems of this kind arise in many scientific fields. We have 
a strong interest in implementing the method on both, general purpose parallel 
hardware and on neuroprocessors. The later would reveal the full potential of 
the proposed approach and would lead to the development of specialized ma- 
chines, that will allow the treatment of difficult and computationally demanding 
problems. 

Ackowledgement: One of us (I. E. L) wishes to acknowledge the warm 
hospitality offered by professors Ishii and Tsoukalas of Purdue University, at 
the school of Nuclear Engineering, during his stay at Lafayette. 

References 

[1] I. E. Lagaris, A. Likas and D. I. Fotiadis, Artificial Neural Networks for 
Solving Ordinary and Partial Differential Equations, Preprint 15-96, Dept. 
of Computer Science, University of Ioannina(1996) Obtainable via anony- 
mous ftp from zeus.cs.uoi.gr, file pub/PAPERS/ODE_PDE/ode_pde.ps 

[2] I. E. Lagaris, A. Likas and D. I. Fotiadis, Artificial Neural Network Methods 
in Quantum Mechanics, Computer Physics Communcations, vol. 104, pp. 
1-14, 1997. 

[3] A. Charalambopoulos, G. Dassios, D. I. Fotiadis and C. Massalas, Fre- 
quency Spectrum of the Human-Neck System, Int. J. Eng. Sci., vol. 35, no. 
8, pp. 753-768, 1997. 

[4] D. I. Fotiadis, M. Boekholt, K. Jensen and W. Richter, Flow and Heat 
Transfer in CVD Reactors: Comparison of Raman Temperature Measure- 
ment and Finite Element Method Prediction, J. of Crystal Growth, vol. 
100, pp. 577-599, 1990. 

[5] D.G. Papageorgiou, I.N. Demetropoulos and I.E. Lagaris, Merlin-3.0, A 
Multidimensional Optimization Environment, Preprint 4-98 Dept. of Com- 
puter Science, University of Ioannina(1997). (To appear in Computer 



15 



Physics Communications Journal, 1998). The Merlin/MCL software is cur- 



[6] The Merlin Control Language for Strategic Optimization, D.G. Papageor- 
giou, I.N. Demetropoulos and I.E.Lagaris, Preprint 5-98 Dept. of Computer 
Science, University of Ioannina(1997). (To appear in Computer Physics 
Communications Journal, 1998). 

[7] K. Hornik, M. Stinchcombe and H. White, Multilayer Feedforward Net- 
works are Universal Approximators, Neural Networks vol. 2, pp. 359-366, 



[8] M. Leshno, V. Lin, A. Pinkus and S. Schocken, Multilayer Feedforaward 
Networks with Nonpolynomial Activation Function can Approximate any 
Function, Neural Networks vol. 6, pp. 861-867, 1993. 

[9] D. Kincaid and W. Cheney, Numerical Analysis, Brooks/Cole Publishing 
Company, 1991. 

[10] D. A. Karras and I. E. Lagaris, A Novel Neural Network Training Technique 
based on a Multi Algorithm Constrained Optimization Strategy, Preprint 
14-96, Department of Computer Science, University of Ioannina (1996). 

[11] A. Likas, D. A. Karras and I. E. Lagaris, Neural Network Training and 
Simulation Using a Multidimensional Optimization System, Int. J. of Com- 
puter Mathematics, to appear. 

[12] R. Fletcher, Practical methods of Optimization, second edition, Wiley, 




1989. 



1987. 



[13] E. Gallopoulos, private communication, September 1997. 



16 



